Genotype-Frequency Estimation from High-Throughput Sequencing Data.
Abstract
Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and, at moderately high depths of coverage, yields essentially unbiased estimates with minimal sampling variances regardless of the mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.
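As a rough illustration of the idea in the abstract, the following sketch estimates genotype frequencies at a single biallelic site by maximum likelihood and then applies a likelihood-ratio test for deviation from Hardy-Weinberg equilibrium. It is not the algorithm implemented in the GFE package. It assumes a simplified model in which every read shows either the major or the minor allele, the per-read error rate is known and fixed, and each individual is summarized by a hypothetical `(n_reads, n_minor)` pair. All function names are illustrative.

```python
import math

def genotype_likelihoods(n_reads, n_minor, error_rate):
    """P(reads | genotype) for MM, Mm, mm, up to a constant factor.

    Assumption: a read from a homozygote shows the wrong allele with
    probability error_rate; a heterozygote yields each allele with
    probability 0.5 (biparental chromosome sampling).
    The binomial coefficient is omitted because it is the same for
    all three genotypes.
    """
    probs = (error_rate, 0.5, 1.0 - error_rate)
    return [p ** n_minor * (1.0 - p) ** (n_reads - n_minor) for p in probs]

def log_likelihood(data, freqs, error_rate):
    """Log-likelihood of genotype frequencies, summed over individuals."""
    total = 0.0
    for n_reads, n_minor in data:
        liks = genotype_likelihoods(n_reads, n_minor, error_rate)
        total += math.log(sum(f * l for f, l in zip(freqs, liks)))
    return total

def em_genotype_freqs(data, error_rate, hwe=False, n_iter=500):
    """EM estimate of (P_MM, P_Mm, P_mm); with hwe=True the frequencies
    are constrained to Hardy-Weinberg proportions."""
    freqs = [1.0 / 3.0] * 3
    for _ in range(n_iter):
        sums = [0.0, 0.0, 0.0]
        for n_reads, n_minor in data:
            liks = genotype_likelihoods(n_reads, n_minor, error_rate)
            weights = [f * l for f, l in zip(freqs, liks)]
            total = sum(weights)
            for g in range(3):
                sums[g] += weights[g] / total  # posterior genotype probability
        freqs = [s / len(data) for s in sums]
        if hwe:
            q = freqs[2] + 0.5 * freqs[1]  # expected minor-allele frequency
            freqs = [(1.0 - q) ** 2, 2.0 * q * (1.0 - q), q * q]
    return freqs

def hwe_lrt(data, error_rate):
    """Likelihood-ratio test of HWE (1 degree of freedom)."""
    full = em_genotype_freqs(data, error_rate)
    null = em_genotype_freqs(data, error_rate, hwe=True)
    stat = 2.0 * (log_likelihood(data, full, error_rate)
                  - log_likelihood(data, null, error_rate))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 df
    return full, stat, p_value

# Hypothetical data: (total reads, reads showing the minor allele).
example = [(10, 0)] * 50 + [(10, 5)] * 40 + [(10, 10)] * 10
freqs, stat, p_value = hwe_lrt(example, error_rate=0.01)
print(freqs, stat, p_value)
```

The sketch works with raw probabilities for brevity; at high depths of coverage the per-genotype likelihoods can underflow, so a practical version would work in log space and would typically estimate the error rate rather than fix it.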
Similar resources
Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids
Despite the ever increasing opportunity to collect large-scale datasets for population genomic analyses, the use of high throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty–ADU), which complicates...
Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids.
Despite the increasing opportunity to collect large-scale data sets for population genomic analyses, the use of high-throughput sequencing to study populations of polyploids has seen little application. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty-ADU), which complicates the calculation...
SNP genotyping and parameter estimation in polyploids using low-coverage sequencing data
Motivation: Genotyping and parameter estimation using high throughput sequencing data are everyday tasks for population geneticists, but methods developed for diploids are typically not applicable to polyploid taxa. This is due to their duplicated chromosomes, as well as the complex patterns of allelic exchange that often accompany whole genome duplication (WGD) events. For WGDs within a single ...
Genotype Calling from Population-Genomic Sequencing Data
Genotype calling plays important roles in population-genomic studies, which have been greatly accelerated by sequencing technologies. To take full advantage of the resultant information, we have developed maximum-likelihood (ML) methods for calling genotypes from high-throughput sequencing data. As the statistical uncertainties associated with sequencing data depend on depths of coverage, we ha...
Journal title:
- Genetics
Volume 201, Issue 2
Pages: -
Publication date: 2015